As more and more service providers choose Cloud platforms, which are provided by third-party resource providers,
resource providers need to provision resources for heterogeneous workloads in different Cloud scenarios. Taking into
account the dramatic differences among heterogeneous workloads, can we coordinately provision resources for heterogeneous
workloads in Cloud computing? In this paper we focus on this important issue, which few previous works have
investigated. Our contributions are threefold: (1) we propose a coordinated resource provisioning solution for
heterogeneous workloads in each of two typical Cloud scenarios: first, a large organization operates a private Cloud for two
heterogeneous workloads; second, a large organization or two service providers running heterogeneous workloads resort
to a public Cloud; (2) we build an agile system, PhoenixCloud, that enables a resource provider to create coordinated
runtime environments on demand for heterogeneous workloads when they are consolidated on a Cloud site; and (3) we
perform a comprehensive experimental evaluation. For two typical heterogeneous workload traces, parallel
batch jobs and Web services, our experiments show that: a) in the private Cloud scenario, when the throughput is almost
the same as that of a dedicated cluster system, our solution decreases the configuration size of a cluster by about 40%; b)
in the public Cloud scenario, our solution decreases not only the total resource consumption but also the peak resource
consumption, the latter to as little as 31% of that of the EC2 + RightScale solution.

Keywords: Infrastructure Management, Cloud Computing, Heterogeneous Workloads, Coordinated Resource
Provisioning



1. Introduction 

Traditionally, users tend to use a dedicated cluster system
(DCS for short) to provide homogeneous services. The resource
utilization rate of a DCS varies: for unexpected peak loads, a
DCS cannot provision enough resources, while many resources
sit idle under normal loads. Recently, several pioneering
computing companies have been adopting infrastructure as a
service (IaaS). For example, as a resource provider, Amazon
provides Elastic Compute Cloud (EC2) services [8] to end users
in order to offer outsourced resources at the granularity of
XEN virtual machines [39]. A new term, Cloud, is used to
describe this new computing paradigm [33] [41]. We consider
the most appropriate definition to be the one given in [sl].
According to this definition, a Cloud is a large pool of easily
usable and accessible virtualized resources, which can be
dynamically reconfigured to adjust to a variable load (scale),
allowing also for optimum resource utilization.

As more and more service providers choose Cloud platforms,
which are provided by third-party resource providers, a
resource provider (which can be regarded as a Cloud
infrastructure provider) needs to provision resources for
heterogeneous workloads in Cloud computing. In this paper, we
consider two representative Cloud scenarios: first, a large
organization operates a private Cloud for heterogeneous
workloads; second, a large organization or service providers
running heterogeneous workloads resort to a public Cloud. For
example, a large organization operates two DCSs for its two
affiliated departments: a batch queuing system for parallel
batch jobs for the first department and a Web service
infrastructure for the second one. Is it possible for this
organization to resort to a private Cloud or a public Cloud
solution? Besides, independent service providers may also run
different heterogeneous workloads. Heterogeneous workloads
have different resource management requirements; for example,
workloads of parallel batch jobs and Web services differ in
resource consumption characteristics, performance goals and
time scales of management [stII (which we explain further in
Section 3). Taking into account the dramatic differences among
heterogeneous workloads, can we coordinately provision
resources for heterogeneous Cloud workloads? This issue is the
main focus of our paper. Besides, the runtime environment
software that is responsible for managing cluster resources and
workloads plays an important role, since it has a great impact
on resource utilization and on the quality of service of user
applications. Traditional runtime environments support only
homogeneous workloads, for example, OpenPBS §^ for parallel
batch jobs or Oceano pj] for Web services. Hence we also need
to consider a supporting issue in this paper: how should we
design and implement runtime environment software that enables
provisioning resources for heterogeneous workloads in
different Cloud scenarios?

Previous work fails to resolve the above issues in two ways.
First, previous work either devises a scheduling algorithm for
parallel batch jobs with different resource demands [35], or
proposes resource allocation algorithms for virtualized
service hosting platforms [47], in which clustered servers run
components of continuously running services, or presents
resource management policies (implementing leases as virtual
machines [36]) for homogeneous workloads (only parallel batch
jobs) mixed with best-effort lease requests and advance
reservation requests in Cloud scenarios. Hence we cannot
leverage existing knowledge to resolve coordinated resource
provisioning issues for heterogeneous Cloud workloads.

Second, though previous systems can provision virtual
infrastructure [46] or hosted application environments [12] in
private/hybrid clouds, no previous effort pays attention to the
emerging requirements of coordinated resource provisioning for
heterogeneous workloads, and no system enables creating
coordinated runtime environments on demand. A coordinated
runtime environment is one that can share coordinated resources
with another runtime environment. For example, if the large
organization described above chooses a Cloud platform, its two
runtime environments fall into this category. To the best of
our knowledge, this is the first work to focus on the above
issue. We design and implement an innovative system,
PhoenixCloud, that helps a resource provider provision
coordinated runtime environments on demand for heterogeneous
workloads in Cloud computing. The contributions of our paper
are summarized as follows:

(1) For each of two typical Cloud scenarios, we propose a
coordinated resource provisioning solution for two
representative heterogeneous workloads (parallel batch jobs
and Web services): first, a large organization operates a
private Cloud for two heterogeneous workloads; second, a large
organization or two service providers running heterogeneous
workloads resort to a public Cloud.

(2) We build an innovative system PhoenixCloud to enable 
creating coordinated runtime environments for heteroge- 
neous workloads. 

(3) We perform a comprehensive experimental evaluation. For
typical workload traces of parallel batch jobs and Web
services, our experiments show that: a) in the private Cloud
scenario, when the throughput is almost the same as that of a
DCS, our solution decreases the configuration size of the
cluster by about 40%; b) in the second Cloud scenario, our
solution decreases not only the total resource consumption but
also the peak resource consumption, the latter to as little as
31% of that of the EC2 + RightScale solution.

This paper includes seven sections. Section 2 summarizes the
related work. Section 3 introduces several representative
runtime environment requirements. Section 4 explains the
PhoenixCloud design and implementation. Section 5 proposes two
policies for coordinated resource provisioning. Section 6
evaluates our system, and Section 7 draws the conclusion.



2. Related Work 

In this section, we summarize related work on agile
infrastructure and resource provisioning.

2.1. Agile Infrastructure: Description Models and Systems 

Description models: EC2 allows end users to describe their
resource requirements, e.g., virtual machines, and EC2's
extended services, such as RightScale [32], allow service
providers to describe their Web service requirements. A. Keller
et al [1^ propose a framework to specify service-level
agreements for Web services. A. Hoheisel et al [15] present a
framework to define both workflow and dataflow for job
applications. F. Galán et al [12] propose a service
specification language for Cloud computing platforms in order
to facilitate interoperability among IaaS clouds, and also
address important issues such as custom automatic elasticity
and performance monitoring. R. Buyya et al [3^ propose a
meta-negotiation document to determine the definition and
measurement of user QoS parameters. However, none of these
models is suited to describing the diverse runtime environment
requirements involved in creating coordinated runtime
environments on demand.

Systems: EC2 directly provisions resources to end users.
Without enabling the user role of service provider, EC2 relies
upon end users' manual management of resources. EC2's extended
services, such as RightScale [32], Enomalism and GoGrid [l^,
provide automated Cloud computing management systems that
assist users in creating and deploying only scalable Web
service applications running on EC2 platforms. D. Irwin et al
[6] share a goal similar to ours: their Shirako prototype
provides a service-oriented architecture in which resource
providers and consumers negotiate access to resources over
time. However, Shirako does not explicitly support service
providers in expressing personalized runtime environment
requirements, especially coordinated runtime environments for
heterogeneous workloads.

M. Steinder et al [37] show that a virtual machine allows
heterogeneous workloads to be collocated on any server
machine, and propose a system architecture for managing
heterogeneous workloads. However, that work does not treat the
runtime environment as a first-class entity in the design. In
our opinion, a runtime environment being a first-class entity
has three meanings: a) there is a runtime environment
specification that is capable of expressing diverse runtime
environment requirements; b) a runtime environment can be
created on demand according to a runtime environment
agreement; c) there is a framework that supports the
development of a runtime environment satisfying a new
requirement. R. S. Montero et al [29] propose an architecture
to provision computing elements that focuses on resolving the
growing heterogeneity (hardware and software configuration) of
the organizations that join a Grid. A. Bavier et al [4]
demonstrate dynamic instantiation of distributed
virtualization in a wide-area testbed deployment with a
sizable user base, whereby each service runs in an isolated
slice of PlanetLab's global resources. B. Rochwerger et al
[27] pay attention to implementing an architecture that would
enable providers of Cloud infrastructure to dynamically partner
with each other. Two open source projects, OpenNebula
(http://www.opennebula.org/) and Haizea
(http://haizea.cs.uchicago.edu/), are complementary and can be
used to manage virtual infrastructures in private/hybrid
clouds [45] [46]. However, those systems do not enable creating
coordinated runtime environments for heterogeneous Cloud
workloads.

2.2. Resource provisioning 

M. Steinder et al JVQ only exploit a range of new automation
mechanisms that benefit a system with a homogeneous,
particularly non-interactive, workload by allowing more
effective scheduling of jobs. M. Silberstein et al [35] devise
a scheduling algorithm for a workload in which massively
parallel tasks that require large resource pools are
interleaved with short tasks that require fast response but
consume fewer resources. In essence, they only consider
parallel batch jobs with different resource demands. M.W. Margo
et al [1^ are interested in metascheduling capabilities
(co-scheduling for Grid applications) in the TeraGrid system,
including user-settable reservations among distributed cluster
sites. B. Lin et al provide an OS-level scheduling mechanism,
VSched [26]. VSched enforces compute rate and interactivity
goals for interactive workloads (including Web workloads) and
non-interactive ones. It provides soft real-time guarantees
for VMs hosted on a single server machine.
L. Grit et al [14] design a Winks scheduler to support a
weighted fair sharing model for a virtual "cloud" computing
utility, such as Amazon's EC2, where each request is for a
lease of some specified duration for one or more virtual
machines. The goal of the Winks algorithm is to satisfy these
requests from a resource pool in a way that preserves fairness
across flows, while our work focuses on how to provision
resources for heterogeneous workloads when they are
consolidated on a Cloud site. M. Stillwell et al [47] propose
resource allocation algorithms for virtualized service hosting
platforms, in which clustered servers run components of
continuously running services, such as Web and e-commerce
applications. In essence, they only consider homogeneous
workloads. B. Sotomayor et al [36] present the design of a
lease management architecture, which implements leases as
virtual machines, to provide leased resources with customized
application environments; it only considers homogeneous
workloads (only parallel batch jobs) mixed with best-effort
lease requests and advance reservation requests. VSched (as an
OS-level scheduling mechanism) or Haizea (for parallel batch
jobs mixed with best-effort lease requests and advance
reservation requests) can be used as a component of our system
for specific workloads when we consider supporting more
workloads.



3. Diverse Runtime Environment Requirements 

In this section, we summarize several representative 
cases of runtime environment requirements on a Cloud site. 

• Case One: Some universities are trying to outsource HPC
services, thereby taking the role of job-execution service
providers 3].

• Case Two: Many small companies have resorted to hosting
environments for deploying Web services so as to decrease
cost.

• Case Three: A large organization has two representative
departments: a batch queuing system for parallel batch jobs
for the first department and a Web service infrastructure for
the second one. Instead of operating two DCSs, the
organization wants either to consolidate the heterogeneous
workloads on a private Cloud or to resort to a public Cloud.

Three observations can be derived from the above three 
cases: 

1. There are three main user roles in the observed systems:
resource providers, service providers and end users. For
example, in Case One, universities play the role of service
providers: they want to outsource resources to a resource
provider and run batch queuing systems for end users,
including graduate students and researchers.

2. A resource provider does need to provision resources and
create runtime environments for heterogeneous workloads. For
example, when the organization in Case Three chooses a private
Cloud or resorts to a public Cloud, or when the two service
providers in Case One and Case Two resort to a public Cloud, a
resource provider needs to provision two different runtime
environments for heterogeneous workloads.

3. For heterogeneous workloads, runtime environment
requirements are dramatically different. Coordinated resource
provisioning for heterogeneous workloads may bring benefits to
both service providers and resource providers.

For example, runtime environments for parallel batch 
jobs and Web services differ in four ways: 

• Workloads are different. Web service workloads are 
often composed of a series of requests; while paral- 
lel batch job workloads are composed of a series of 
submitted jobs, and each job is a parallel or serial 
application. 

• Resource consumptions are different. Running a parallel
application needs a group of exclusive resources, while for
Web services, requests are serviced simultaneously and in an
interleaved manner through multiplexed use of resources.

• Performance goals are different. From the perspective of end
users, submitted parallel batch jobs can in general be queued
when resources are not available. However, for Web services
like Web servers or search engines, each individual request
needs an immediate response.

• Time scales of management are different [s^l. Due to the
nature of their performance goals and the short duration of
individual requests, Web services need automation at short
control cycles, e.g., seconds; however, parallel batch jobs
typically require calculation of a schedule for an extended
period of time ^], e.g., hours.

When web service applications and parallel batch jobs 
are consolidated, we can propose coordinated resource pro- 
visioning solutions, since they have different performance 
goals. 

4. PhoenixCloud Design and Implementation 

In Section 4.1, we introduce the objectives of PhoenixCloud.
Section 4.2 proposes a runtime environment specification. In
Section 4.3, we describe the architecture.

4.1. Objectives

PhoenixCloud has several objectives:

1. Responsibility division between a resource provider and
service providers. In our system, a resource provider is
responsible for creating and destroying runtime environments
and for provisioning resources to different runtime
environments on a Cloud site, while a service provider only
focuses on providing service.

2. Provisioning a runtime environment on the basis of a
runtime environment specification. PhoenixCloud provides a
runtime environment specification for a service provider to
express runtime environment requirements. According to a
runtime environment specification, a runtime environment is
provisioned on demand.



3. Pluggable resource types [2l|. Similar to Shirako,
provisioned resources will include servers, storage, and
network resources. Presently, our system mainly facilitates
provisioning servers at the granularity of a node or a virtual
machine.

4. Coordinated resource provisioning for heterogeneous 
workloads. If allowed by service providers, Phoenix- 
Cloud supports coordinated resource provisioning for 
two heterogeneous workloads. 

PhoenixCloud evolves from our previous Phoenix system [40]. We
have implemented PhoenixCloud on the Dawning 5000 cluster
system, which ranked in the top 10 of the Top 500
supercomputers in November 2008. It is expected that
PhoenixCloud will be deployed on the Dawning 6000 supercomputer
system in the Shenzhen supercomputing center, China, in 2010.

4.2. Runtime Environment Specification

We present a runtime environment specification as a basis for
provisioning a runtime environment or two coordinated runtime
environments on demand. In our opinion, in addition to
service-level agreements between service providers and end
users, job definitions for computational applications and
service definitions for Web services, both a resource provider
and a service provider need a runtime environment
specification to express diverse runtime environment
requirements. On this basis, a resource provider can flexibly
provision runtime environments on demand for service
providers. Figure 1 shows the relationships of the different
agreements among the different roles.

Figure 1: The agreements among a resource provider, service
providers and end users.

A runtime environment specification includes informa- 
tion as follows: 

1. Relationships between a service provider and a re- 
source provider. 

We support three different relationships: same, affiliated or
business. Same means that a single user plays the roles of
both resource provider and service provider, which describes a
traditional DCS; affiliated means that a user playing the role
of service provider is affiliated to a user playing the role of
resource provider, which can describe Case Three in Section 3;
business means that a service provider has a business
relationship with a resource provider, which can describe Case
One or Case Two in Section 3.

2. Workload types. 

We support two workload types: parallel batch jobs and Web
services. We are extending the system to support MapReduce and
Dryad applications in the Cloud.

3. The allocation granularity of resources. 

We support resource allocation at the granularity of nodes or
virtual machines (e.g., XEN). For virtual machines, we provide
predefined or user-defined virtual machine templates. For both
nodes and virtual machines, users need to specify customized
operating system types and versions.

4. Coordinated runtime environments. 

A service provider needs to decide two things: (a) whether a
new runtime environment has a coordinated runtime environment
that belongs to the same service provider; (b) whether the
service provider agrees that a new runtime environment may be
coordinated to share resources with a runtime environment of
another service provider.

5. Resource coordination models and bound sizes of re- 
sources. 

In each runtime environment, a service provider needs to
specify two optional resource bounds: the lower resource bound
and the upper resource bound. The lower resource bound is rigid
in that a resource provider guarantees that resources within
the limit of the lower resource bound will only be allocated
to a runtime environment or its coordinated runtime
environment. The upper resource bound is flexible in that
resources within the range defined by the lower resource bound
and the upper resource bound first satisfy resource requests
of the specified runtime environment or its coordinated
runtime environment, but can be reallocated to another runtime
environment when they are idle. Figure 2 shows the relationship
between the two bound sizes of resources.

For two typical heterogeneous workloads, Web services and
parallel batch jobs, we propose a resource coordination model
for each of two Cloud scenarios. In the private Cloud scenario,
we presume that a resource provider owns fixed resources in a
private Cloud that satisfy the resource requests of two
coordinated runtime environments. For this scenario, we set
the same size for both the lower resource bound and the upper
resource bound. For two coordinated runtime environments, the
size of the coordinated resources that are shared by the two
runtime environments is the sum of the lower resource bounds
of the two runtime environments. We call this model the FB
(Fixed Bound) model.

In the public Cloud scenario, we presume that a resource
provider owns enough resources to satisfy the resource
requests of N service providers (N >> 2). For a runtime
environment, we only specify the lower resource bound size and
leave the upper resource bound size undefined. Each runtime
environment can request more resources beyond the limit of the
lower resource bound.

For coordinated resource provisioning, we choose the 
following principles: 

• If a service provider does not allow coordinated resource
provisioning, a runtime environment will be provisioned
independently. In this case, RightScale [32] and our previous
effort [4ll respectively show that individual Web service
applications or scientific computing workloads like parallel
batch jobs can benefit from the elastic management of Cloud.

• If allowed by service providers, we support coordinated
resource provisioning at the granularity of a group, which is
composed of two runtime environments, one for each of two
heterogeneous workloads.

• For a runtime environment, if we cannot find another
coordinated one, we will independently provision that runtime
environment.

We call the above model a FLB_NUB (Fixed Lower 
Bound and No Upper Bound) model. 



[Figure 2: two stacked resource regions. Between the lower
resource bound and the upper resource bound: allocated to a RE
or its coordinated RE, but will be reallocated to another RE
when they are idle. Within the lower resource bound: only
allocated to a RE or its coordinated RE.]

Figure 2: Two resource bound sizes. In this figure, RE stands
for runtime environment.

6. The setup policy. 

A service provider determines when and how to perform the
setup work when resources are dynamically requested or
released. The setup work includes provisioning operating
systems and configuring applications. For example, if the
service provider is highly concerned about the security of its
data, it may require wiping the operating system and data on
disks when a node is released to the resource provider.

Figure 3 shows part of a runtime environment specification for
parallel batch jobs for Case Three in Section 3. Our runtime
environment specification is easily extensible, since we use
XML (eXtensible Markup Language) to express it.
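The two resource bounds and the two coordination models (FB and FLB_NUB) described in this section can be sketched in Python as follows. This is only an illustrative sketch: the class name, function names, and the rule used in `reallocatable_when_idle` are assumptions made for the example and are not part of PhoenixCloud.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceBounds:
    """Bounds a service provider specifies for one runtime environment (RE).

    Field names are hypothetical; sizes are counted in nodes or VMs.
    """
    lower: int                   # rigid: reserved for this RE or its coordinated RE
    upper: Optional[int] = None  # flexible; None means the upper bound is undefined

    def __post_init__(self) -> None:
        if self.lower < 0:
            raise ValueError("lower bound must be non-negative")
        if self.upper is not None and self.upper < self.lower:
            raise ValueError("upper bound must not be below lower bound")


def fb_bounds(size: int) -> ResourceBounds:
    """FB model (private Cloud): lower and upper bounds have the same size."""
    return ResourceBounds(lower=size, upper=size)


def flb_nub_bounds(lower: int) -> ResourceBounds:
    """FLB_NUB model (public Cloud): fixed lower bound, no upper bound."""
    return ResourceBounds(lower=lower, upper=None)


def fb_coordinated_pool(a: ResourceBounds, b: ResourceBounds) -> int:
    """Under FB, two coordinated REs share the sum of their lower bounds."""
    return a.lower + b.lower


def reallocatable_when_idle(bounds: ResourceBounds, held: int, idle: int) -> int:
    """Number of idle resources that may be lent to another RE.

    Assumption of this sketch: only resources held above the lower bound
    are eligible, which is one reading of the flexible upper bound rule.
    """
    above_lower = max(0, held - bounds.lower)
    return min(idle, above_lower)
```

For instance, `fb_coordinated_pool(fb_bounds(100), fb_bounds(60))` gives a shared pool of 160 under the FB model, while `flb_nub_bounds(100)` leaves the upper bound undefined so that the runtime environment may request resources beyond its lower bound.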

4.3. PhoenixCloud architecture

Layered architecture: PhoenixCloud follows a two-layered
architecture: one layer is the common service framework (CSF
for short) for a resource provider, and the other is the thin
runtime environment software (TRE for short) for a service
provider. The two-layered architecture has two implications.
First, there is a separation between the CSF and a TRE. The
CSF is provided and managed by a resource provider,
independent of any TRE. With the support of the CSF, a TRE or
two coordinated TREs can be created on demand for a service
provider. Second, for heterogeneous workloads, the common sets
of functions of runtime environments are delegated to the CSF,
while a TRE only implements the core functions for a specific
workload.

<runtime_environment_agreement name="user1">
  <relationship type="business">
  </relationship>
  <workload type="parallel_batch_jobs">
  </workload>
  <environment type="coordinated"
               granularity="node"
               resource_coordination_mode="FLB_NUB"
               lower_bound_size="100"
               upper_bound_size="null">
    <setup_policy ="NO">
  </environment>
</runtime_environment_agreement>

Figure 3: A part of a runtime environment specification
Figure 4: Interactions of three user roles in PhoenixCloud

As shown in Figure 4, there are three interacting user roles
in PhoenixCloud: a resource provider, service providers and
end users:

• The CSF is running on the Cloud site. A resource 
provider is responsible for provisioning runtime envi- 
ronments with the support of the CSF. 

• The CSF provides a Web portal for a service provider 
to describe its runtime environment requirements. Af- 
ter a service provider has requested to create a run- 
time environment, the CSF is responsible for deploy- 
ing and starting a TRE. 

• After a service provider has activated its runtime
environment, an associated Manager monitors workload changes
and resource status on its behalf. Manager is a core component
of a TRE. Each Manager requests or releases resources on
behalf of a service provider according to load status and
resource status.

• Once a runtime environment is providing service, end users
use the Web Portal to submit jobs or send requests.

The advantages of separating the CSF and a TRE are twofold.
First, developing a new TRE for a different workload is
lightweight, since many common functions have already been
implemented in the CSF. Second, creating a TRE on demand is
lightweight, since the CSF is ready and running before any TRE
is created.

Main components of the CSF: The major compo- 
nents of the CSF are as follows: 

1. Lifecycle Management Service is responsible for man- 
aging the lifecycle of a TRE. 

2. Resource Provision Service is responsible for provi- 
sioning resources to a TRE. 

3. Virtual Machine Provision Service is responsible for
managing the lifecycle of a virtual machine (e.g., a XEN
virtual machine), such as creating or destroying it.

4. Deployment Service is a collection of services for de- 
ploying and booting the operating system, the CSF 
and TREs. Major services include DHCP, TFTP, and 
FTP. 

5. Agent on each node is responsible for discovering node 
resources, such as CPU information, memory size and 
operating system version; downloading required soft- 
ware packages; starting or stopping service daemons, 
and transferring data. 

6. There are two types of monitors: Resource Monitor 
and Application Monitor. Resource Monitor on each 
node monitors usages of physical resources, e.g. CPU, 
memory, swap, disk I/O and network I/O; Application 
Monitor monitors application status. 

7. Process Management Service is responsible for 
starting, signaling, killing, and monitoring paral- 
lel/sequential jobs. 

Main components of a TRE: There are three components in a
TRE: Manager, Scheduler and Web Portal. Manager is responsible
for dealing with users' requests, managing resources and
interacting with the CSF. Scheduler is responsible for
scheduling users' jobs or distributing user requests. Web
Portal is the GUI through which end users submit and monitor
jobs or applications. When a TRE is created, a configuration
file describes the dependencies among these components. The
details can be found in Section 4 of our previous work [43].

The customized policies of the CSF and a TRE: Figure 5 shows
the major components and their extension points for the
management mechanisms.

Specified for Resource Provision Service, a resource provision
policy determines when Resource Provision Service provisions
resources to a TRE and how many, or how to coordinate
resources between two coordinated runtime environments; a
setup policy determines when and how to do the setup work,
such as wiping the operating system or doing nothing.

Specified for Manager, a resource management policy determines
when Manager requests or releases resources from or to
Resource Provision Service, how many, and according to what
criteria.
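As an illustration of what a resource management policy for Manager might look like, the sketch below uses a simple threshold rule on the ratio of demanded to held resources. The class name, the two ratio thresholds and the `decide` interface are hypothetical; the paper does not specify them at this point.

```python
from dataclasses import dataclass


@dataclass
class ThresholdPolicy:
    """Hypothetical resource management policy for a Manager.

    request_ratio and release_ratio are illustrative thresholds on
    demanded / held resources; they are not taken from the paper.
    """
    lower_bound: int            # resources the TRE always keeps
    request_ratio: float = 1.5  # request more when demand/held exceeds this
    release_ratio: float = 0.5  # release when demand/held falls below this

    def decide(self, held: int, demanded: int) -> int:
        """Return >0 to request that many resources, <0 to release, 0 to hold."""
        if held <= 0:
            # Nothing held yet: start from the larger of demand and lower bound.
            return max(demanded, self.lower_bound)
        ratio = demanded / held
        if ratio > self.request_ratio:
            return demanded - held
        if ratio < self.release_ratio:
            # Never shrink below the lower resource bound.
            target = max(demanded, self.lower_bound)
            return target - held
        return 0
```

With `ThresholdPolicy(lower_bound=50)`, a Manager holding 100 resources would request 100 more when 200 are demanded, and would release 50 when only 20 are demanded, stopping at its lower bound.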

For different workloads, the scheduling policy has different
implications. For parallel batch jobs, the scheduling policy
determines when and how the scheduler chooses parallel jobs
for running. For Web services, the scheduling policy includes
two policies: the instance adjustment policy and the request
distribution policy. The instance adjustment policy decides
when the number of Web service instances is adjusted and to
what extent, and the request distribution policy decides how
to distribute requests and according to what criteria.
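The two Web service policies can be sketched as a pair of small Python functions. This is a minimal sketch under stated assumptions: the 80% high-water mark follows the example threshold the paper gives for WS Manager, while the low-water mark, the proportional sizing rule and the round-robin distribution are assumptions chosen for illustration, not the policies PhoenixCloud necessarily uses.

```python
import itertools
import math
from typing import Iterator, List, Tuple

Instance = Tuple[str, int]  # (IP, port) of a Web service instance


def adjust_instances(current: int, avg_cpu_util: float,
                     high: float = 0.80, low: float = 0.30,
                     min_instances: int = 1) -> int:
    """Instance adjustment policy: return the new number of instances.

    `low` and the proportional sizing rule are assumptions of this sketch.
    """
    if avg_cpu_util > high:
        # Grow so that the same load would sit near the high-water mark.
        return max(current + 1, math.ceil(current * avg_cpu_util / high))
    if avg_cpu_util < low and current > min_instances:
        return current - 1
    return current


def round_robin(instances: List[Instance]) -> Iterator[Instance]:
    """Request distribution policy: cycle through the registered instances."""
    return itertools.cycle(instances)
```

A Manager could call `adjust_instances` each time monitoring data arrives and then re-register the resulting instance list with the load balancer, which would draw the target of each request from `round_robin`.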



[Figure 5: the diagram shows, within the CSF, the Web Portal of
Resource Provider, Lifecycle Management Service, and Resource
Provision Service with its two extension points, Resource
Provision Policy and Setup Policy.]

Figure 5: The summary of interactions and extension points for
management mechanisms of PhoenixCloud. Number 1 indicates
creating, destroying, activating and deactivating a TRE;
Number 2 indicates requesting and releasing resources; Number
3 indicates proactively provisioning resources.



Interactions of a TRE with the CSF: In the rest of this paper,
we call a TRE for parallel batch jobs a PBJ TRE, and a TRE for
Web services a WS TRE. Figure 6 shows the interactions between
TREs and the CSF in two coordinated runtime environments.

The interactions of a WS TRE with the CSF are explained as
follows:

• WS Manager obtains resources with the size of the lower
resource bound from Resource Provision Service, and runs Web
service instances with a matching scale.

• WS Manager interacts with Load Balancer (a type of
Scheduler) to set its request distribution policy. WS Manager
registers the IP and port information of Web service instances
with Load Balancer, which is responsible for assigning
workload to Web service instances, and Load Balancer
distributes requests to Web service instances according to the
request distribution policy. We integrate LVS as the IP-level
load balancer.

• Monitor on each node periodically checks resource
utilization rates and reports to WS Manager. If the threshold
performance value is exceeded, e.g., the average utilization
rate of CPUs consumed by instances exceeds 80%, WS Manager
adjusts the number of Web service instances according to the
instance adjustment policy.

• According to the current Web service instances, WS Manager
requests or releases resources from or to Resource Provision
Service.

The interactions of a PBJ TRE with the CSF are explained as
follows:

• Scheduling events tell PBJ Manager to send scheduling
commands to Scheduler. Scheduling events include the timer
registered by the administrator and new job arrivals.

• Scheduler requests job and node information from PBJ
Manager, and decides which jobs to run according to a
scheduling policy.

• Driven by periodic timers, PBJ Manager scans the jobs in the
queue. If the threshold values defined in a resource
management policy are exceeded, PBJ Manager will request or
release resources from or to Resource Provision Service.


Lifecycle management of a TRE: A traditional 
runtime environment is self-contained. PhoenixCloud 
facilitates creating a TRE on demand. Each TRE 
has three states: uninitialized, created and running. 
The uninitialized state indicates the nascent state of 
a TRE. The created state implies that a TRE for 
the specific workload is configured and deployed on a 
Cloud site. The running state has two meanings:
first, resources with the lower bound size are allocated to
a TRE; second, a TRE is providing services to end users.




Figure 6: Interactions of a PBJ TRE and a WS TRE with 
the CSF. 



Taking the runtime environment specification of Fig. 3
as an example, we introduce the major interactions in the
lifecycle management as follows:

1. Through Web Portal of a resource provider, a service
provider creates its account, and then defines its
runtime environment requirements.

2. Through Web Portal of a resource provider, a ser- 
vice provider sends a message of creating a runtime 
environment to Lifecycle Management Service. Then 
Lifecycle Management Service marks the state of the 
new runtime environment as uninitialized. 

3. Lifecycle Management Service sends a message of de- 
ploying a runtime environment to Agents on related 
nodes, which requests Deployment Service to down- 
load the required software package of the new TRE. 
After the new TRE is deployed, Lifecycle Manage- 
ment Service marks its state as created. 

4. A service provider sends a message of activating a 
runtime environment to Lifecycle Management Ser- 
vice through Web Portal of a resource provider. 

5. Lifecycle Management Service sends the configuration
information of a new TRE to Resource Provision Service,
including the lower resource bound, the upper
resource bound, the resource provision model and the
setup policy. For a new PBJ TRE, Resource Provision
Service will search for a WS TRE from another service
provider for coordinated resource provisioning, if the
service provider allows it and does not itself run a
Web service workload.

6. Lifecycle Management Service sends a message of
starting components of the new TRE to Agents. When
the components of the new TRE are started, the
command parameters tell the components which
policies to apply. Then Lifecycle Management
Service marks the state of the new TRE as running.

7. Before resources are provisioned to the new TRE, the
setup policy is triggered by Resource Provision
Service. When the setup work is done, Resource
Provision Service notifies Manager that resources are
ready.

8. The new TRE begins providing service to end users 
through Web Portal of the service provider. 

9. According to load status, Manager dynamically
requests or releases resources, which will also trigger
the setup policy.

To save space, we omit the processes of deactivating
and destroying a TRE.
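
The state changes in the steps above can be summarized as a small state machine. The following Python sketch is only an illustration: the class names `TreState` and `Tre` and the message names `deploy` and `activate` are hypothetical and not part of PhoenixCloud. Deactivating and destroying a TRE are left out because the text omits them.

```python
from enum import Enum


class TreState(Enum):
    UNINITIALIZED = "uninitialized"
    CREATED = "created"
    RUNNING = "running"


# Forward transitions only; the message names are hypothetical labels
# for "deploying" (step 3) and "activating" (steps 4 to 6).
TRANSITIONS = {
    ("deploy", TreState.UNINITIALIZED): TreState.CREATED,
    ("activate", TreState.CREATED): TreState.RUNNING,
}


class Tre:
    """A runtime environment as tracked by a lifecycle management service."""

    def __init__(self, name):
        self.name = name
        # Step 2: a newly created runtime environment is marked uninitialized.
        self.state = TreState.UNINITIALIZED

    def handle(self, message):
        """Apply a lifecycle message, rejecting transitions that are not allowed."""
        key = (message, self.state)
        if key not in TRANSITIONS:
            raise ValueError(
                f"{message!r} is not allowed in state {self.state.value}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

Keeping the transitions in one table makes it easy to see that a TRE cannot reach the running state without first being deployed.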

The advantage of PhoenixCloud: The advantages
of our system are twofold. First, our system enables a
service provider to express diverse runtime environment
requirements, and enables creating runtime environments,
especially coordinated runtime environments, on demand.
With a runtime environment specification as a basis, our
system can adapt to different cases without an architectural
change. For example, our system can adapt to the three
cases in Section 3. Second, our system supports coordinated
resource provisioning for heterogeneous workloads,
and our experiments in Section 6 will show this benefit.

5. RESOURCE COORDINATION AND MANAGEMENT POLICIES

In this section, we respectively propose policies for FB 
and FLB-NUB models in consolidating two typical hetero- 
geneous workloads: Web services and parallel batch jobs. 

5.1. The FB policy

We propose the FB resource coordination policy as
follows:

1. In creating two coordinated runtime environments (a 
PBJ TRE and a WS TRE) for two heterogeneous 
workloads, service providers specify the same value 
for the lower resource bound and the upper resource 
bound for each runtime environment. 

2. Resource Provision Service allocates resources with 
the sizes of the lower resource bounds to two TREs at 
their startups. The size of coordinated resources that 
are shared by two coordinated runtime environments 
is the sum of the lower resource bounds of two runtime 
environments. 

3. Resource demands of a WS TRE have higher priority
than those of a PBJ TRE. If a WS TRE demands
resources that cannot be satisfied by Resource Provision
Service, the latter will force the PBJ TRE to
release resources with the size required by the WS TRE,
and then reallocate the resources to the WS TRE.






4. Resource Provision Service registers a periodical
timer, whose period is one time unit of leasing
resources, for checking idle resources within the limit
of the size of coordinated resources.
For example, in EC2, the minimal time unit of leasing
resources is one hour. If there are idle resources,
Resource Provision Service will provision all idle
resources to a PBJ TRE.

For the above resource provision policy, the matched 
resource management policy of a PBJ TRE is as follows: 

1. PBJ Manager receives the resources provisioned by 
Resource Provision Service. 

2. If Resource Provision Service forces PBJ Manager to
return resources, the latter will release resources with
the required size. If PBJ Manager does not have enough
idle resources, it will kill running jobs in ascending
order of job size, and then release
resources with the required size. If more than one
running job has the same job size, the job
with the latest starting time will be killed first.
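
The forced-release rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `RunningJob`, `release_for_ws` and the field names are hypothetical, and resources are counted in whole nodes.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    job_id: str
    size: int          # nodes held by the job
    start_time: float  # start timestamp; larger means started later


def release_for_ws(idle_nodes, running_jobs, required):
    """Pick jobs to kill so that `required` nodes can be returned.

    Returns (killed_job_ids, released_nodes). Jobs are killed smallest
    first; among jobs of equal size, the latest-started job goes first.
    """
    killed = []
    available = idle_nodes
    ordered = sorted(running_jobs, key=lambda j: (j.size, -j.start_time))
    for job in ordered:
        if available >= required:
            break
        killed.append(job.job_id)
        available += job.size
    # Release only what was asked for, or everything if still short.
    return killed, min(available, required)
```

For example, with one idle node and running jobs of sizes 4, 2, 2 and 8, a request for five nodes kills the two size-2 jobs, the later-started one first, and leaves the larger jobs running.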

In the rest of this paper, we call the above policies FB
policies. In a recent work at USENIX ATC '09,
W. Zhang et al. [42] argue that in managing Web services of
data centers, actual experiments are cheaper, simpler, and
more accurate than models for many management tasks.
We hold the same position. In Section 6.4, we will
explain how to obtain the management policies for a specific
Web service through real experiments.

5.2. The FLB-NUB policy 

We propose the FLB-NUB resource coordination policy 
as follows: 

1. In creating two coordinated runtime environments, 
service providers only specify the lower resource 
bound size for each runtime environment with the up- 
per resource bound size undefined. 

2. Resource Provision Service allocates resources with 
the lower bound sizes to a PBJ TRE and a WS TRE 
at their startups. 

3. Resource Provision Service registers a periodical
timer, whose period is one time unit of leasing
resources, for checking idle resources within the limit
of the size of coordinated resources. If
there are idle resources, Resource Provision Service
will provision all idle resources to a PBJ TRE.

4. If a WS TRE demands resources, Resource Provision
Service will allocate enough resources.

For the above resource provision policy, the matched
resource management policy of a PBJ TRE is as follows:
We define the ratio of adjusting resources as the ratio of
the accumulated resource demands of all jobs in queue to
the current resources owned by a TRE. When the ratio of
adjusting resources is greater than one, it indicates that for
immediate running, the jobs in the queue need more
resources than those currently owned by a TRE.
We set two threshold values of adjusting resources, and
respectively call them the threshold ratio of requesting
resources and the threshold ratio of releasing resources.
The processes of requesting and releasing resources are as
follows:

1. PBJ Manager registers a periodical timer (a time unit
of leasing resources) for adjusting resources per time
unit of leasing resources. Driven by the periodical
timer, PBJ Manager scans jobs in queue.

2. If the ratio of adjusting resources exceeds the
threshold ratio of requesting resources, PBJ Manager will
request resources with the size of DR1 as follows:

DR1 = (the accumulated resource demand of all jobs
in the queue) - (the current resources owned by a PBJ
TRE)

3. If the ratio of adjusting resources does not exceed the
threshold ratio of requesting resources, but the ratio
of the resource demand of the present biggest job in
queue to the current resources owned by a TRE is
greater than one, PBJ Manager will request resources
with the size of DR2:

DR2 = (resources needed by the present biggest job
in queue) - (the current idle resources owned by a
TRE)

When the ratio of the resource demand of the present
biggest job in the queue to the current resources owned
by a TRE is greater than one, it implies that the
largest job cannot run unless more resources become
available.

4. If the ratio of adjusting resources is lower than the
threshold ratio of releasing resources, PBJ Manager
will release idle resources with the size of RSS
(Releasing Size):

RSS = (the elastic factor) x (idle resources owned by
PBJ TRE), where 0 < (the elastic factor) < 1

5. If Resource Provision Service proactively provisions
resources to PBJ Manager, the latter will receive the
resources.
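
The request and release rules in steps 2 to 4 can be sketched as a single decision function. The Python below is an illustration only: the function name `adjust_resources` and its parameters are hypothetical, the TRE is assumed to own at least one node, and rounding the released size down to whole nodes is an assumption of this sketch.

```python
def adjust_resources(queued_demands, owned, idle, u_request, v_release, elastic):
    """Return nodes to request (positive), to release (negative), or 0.

    queued_demands: resource demand (nodes) of each job in the queue
    owned:          nodes currently owned by the PBJ TRE (assumed > 0)
    idle:           idle nodes among those owned
    u_request:      threshold ratio of requesting resources
    v_release:      threshold ratio of releasing resources
    elastic:        elastic factor, strictly between 0 and 1
    """
    if not 0 < elastic < 1:
        raise ValueError("elastic factor must be in (0, 1)")
    total = sum(queued_demands)
    biggest = max(queued_demands, default=0)
    ratio = total / owned          # the ratio of adjusting resources
    if ratio > u_request:
        return total - owned       # DR1
    if biggest / owned > 1:
        return biggest - idle      # DR2
    if ratio < v_release:
        return -int(elastic * idle)  # RSS, rounded down (assumption)
    return 0
```

With thresholds 1.2 and 0.2 and an elastic factor of 0.5, a TRE owning 100 nodes requests 80 more when the queue demands 180 nodes in total, and releases 20 of 40 idle nodes when the queue demands only 10.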

6. PERFORMANCE EVALUATIONS 

In this section, for Web services and parallel batch jobs, 
we compare the performance of PhoenixCloud, DCS and 
EC2+RightScale. 

6.1. Evaluation metrics 

For parallel batch jobs, the metrics are as follows:
we choose the well-known metric, the throughput in terms
of the number of completed jobs [11], to reflect the
major concern of a service provider. We use the average
turnaround time per job to measure the main concern of
end users. The average turnaround time of jobs is the time
from submitting a job till completing it, averaged over all
jobs submitted [2 



For Web service, the metrics are as follows: we choose the
well-known metric, the throughput in terms of requests per
second, to reflect the major concern of a service provider
0] [lo| . For end users, we choose the average response 
time per request to measure the quality of service, which
reflects the major concern of end users.
For two consolidated workloads, we choose the total
resource consumption in terms of node x hour to evaluate
the effectiveness of coordinated resource provisioning. We
particularly care about the peak resource consumption, that
is, the peak value of the resource consumption in terms of
nodes, since it is a key factor in the capacity planning of
the system for a resource provider. For the same workload,
if the peak resource consumption of a system is higher, the
capacity planning of the system is more difficult.
We use the accumulated times of adjusting resources to
evaluate the management overhead of a system, since each
event of requesting, releasing or provisioning resources will
trigger a setup action, for example wiping off the operating
system or data. The accumulated times of adjusting
resources are the number of times that resources are
dynamically requested, released or provisioned while a
runtime environment is providing services.

All performance metrics are obtained in the same period 
that is the duration of workload traces. 
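
As an illustration of how the consolidated-workload metrics could be computed from a trace, the following Python sketch assumes the resource consumption is sampled once per hour as a list of node counts, and that adjustment events are recorded as strings. Both representations and all names are assumptions of this sketch, not the paper's implementation.

```python
ADJUSTMENT_KINDS = {"request", "release", "provision"}


def consumption_metrics(nodes_per_hour):
    """Return (total resource consumption in node x hour, peak in nodes).

    nodes_per_hour: nodes held during each consecutive one-hour interval.
    """
    total_node_hours = sum(nodes_per_hour)
    peak_nodes = max(nodes_per_hour, default=0)
    return total_node_hours, peak_nodes


def accumulated_adjustments(events):
    """Count events that trigger a setup action (request, release, provision)."""
    return sum(1 for kind in events if kind in ADJUSTMENT_KINDS)
```

For a three-hour trace holding 10, 30 and 20 nodes, the total resource consumption is 60 node x hour and the peak resource consumption is 30 nodes.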

6.2. Workload traces 

1. The workload traces of parallel batch jobs 

We choose two typical workload traces from [31]. The
utilization rate of all traces in [31] varies from 24.4%
to 86.5%. We choose one trace with lower load, the NASA
iPSC trace (46.6% utilization), and one trace with
higher load, the SDSC BLUE trace (76.2% utilization).
NASA iPSC is a real trace segment of two weeks starting from
Oct 01 00:00:03 PDT 1993. For the NASA iPSC trace,
the configuration of the cluster system is 128 nodes.
SDSC BLUE is a real trace segment of two weeks starting from
Apr 25 15:00:03 PDT 2000. For the SDSC BLUE trace,
the cluster configuration is 144 nodes.

2. Web service workload 

For Web service, we choose a real workload trace, the
World Cup workload trace [3] from June 7 to June 20
in 1998. The World Cup workload reflects the
nature of a Web service workload, in which the ratio
of peak load to normal load is high. According to a Google
Scholar search, this workload is cited
frequently (488 citations as of June 3, 2010) in the related
research community.

6.3. Experiment methods 

To evaluate and compare the dedicated cluster
system, PhoenixCloud, and EC2+RightScale, we adopt
the following experiment methods.



The real experiments of World Cup workload.
For Web service, we obtain a resource consumption
trace through real experiments that deploy a WS
TRE for the World Cup workload.

The simulated experiments of consolidating two
heterogeneous workloads.

The period of a typical workload trace is often
weeks, or even months. Many key factors affect the
experiment results, so evaluating a system requires
frequently performing time-consuming
experiments. We therefore use simulation to
speed up experiments. We speed up the submission
and completion of jobs by a factor of 100. This
speedup allows a two-week trace to complete in about
three hours.
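
The arithmetic behind the speedup can be checked with a short Python sketch. The helper `compress_job` is hypothetical; it simply divides a job's submission offset and runtime by the speedup factor.

```python
SPEEDUP = 100  # speedup factor stated in the text


def compress_job(submit_time_s, runtime_s, speedup=SPEEDUP):
    """Map a job's submission offset and runtime (seconds) to simulated time."""
    return submit_time_s / speedup, runtime_s / speedup


two_weeks_s = 14 * 24 * 3600                    # 1,209,600 seconds
simulated_hours = two_weeks_s / SPEEDUP / 3600  # about 3.36 hours
```

Fourteen days are 336 hours, so a factor of 100 gives about 3.36 hours, which matches the "about three hours" above.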



The simulated clusters. 

The workload traces are obtained from platforms
with different configurations. For example, NASA
iPSC is obtained from a cluster system with each
node composed of one CPU; SDSC BLUE is obtained
from a cluster system with each node composed
of eight CPUs; the World Cup resource consumption
trace is obtained from the four-core Intel(R)
Xeon(R) platform. In the rest of the experiments, our
simulated cluster system is modeled after the NASA
iPSC cluster, comprising only single-CPU nodes.
So we divide the workload trace of SDSC BLUE by 8.



Synthetic heterogeneous workloads.
To the best of our knowledge, real traces of parallel
batch jobs and Web service on the same platform
are not available. However, the focus of our paper is to
simulate the case of consolidating two heterogeneous
workloads with different peak resource demands on a
Cloud site. So in our experiments, on the basis of the
workload traces introduced in Section 6.2, we scale two
heterogeneous workload traces with different constant
factors. We use a tuple (PRC_PBJ, PRC_WS)
to represent two synthetic heterogeneous workload
traces, where PRC_PBJ is the peak resource demand
of the parallel batch job trace and PRC_WS is the peak
resource demand of the Web service trace. For example,
a tuple of (100, 60) that is scaled on the basis of the SDSC
BLUE and World Cup traces means that we respectively
scale the SDSC BLUE and World Cup traces with
two different constant factors, and on the same
simulated cluster system, the peak resource demands of
SDSC BLUE and World Cup are respectively 100 nodes
and 60 nodes.
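
One way to read "scaling a trace with a constant factor" is sketched below in Python. The sketch assumes a trace is a list of resource demands and that the constant factor is chosen so that the scaled peak equals the target peak; the function name `scale_trace_to_peak` is hypothetical.

```python
def scale_trace_to_peak(demands, target_peak):
    """Scale every demand by one constant factor so the peak hits target_peak."""
    peak = max(demands)
    if peak <= 0:
        raise ValueError("trace must contain a positive demand")
    factor = target_peak / peak
    return [d * factor for d in demands]


# A toy trace with peak 40, scaled to a peak resource demand of 100 nodes.
scaled = scale_trace_to_peak([10, 40, 20], 100)
```

Because a single factor is applied to the whole trace, the shape of the load curve, and therefore the ratio of peak load to normal load, is preserved.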

The testbed. 

As shown in Fig. 7, the testbed includes two types of
nodes: nodes whose names start with glnode and
nodes whose names start with ganode. The
glnode nodes have the same configuration: each
node has 2 GB of memory and two CPUs. Each CPU of
a glnode node is a four-core Intel(R) Xeon(R)
(2.00 GHz). The OS is 64-bit Linux with kernel
2.6.18-xen. The ganode nodes have the same configuration:
each node has 1 GB of memory and two CPUs,
AMD Opteron 242 (1.6 GHz). The OS is 64-bit Linux
with kernel version 2.6.5-7.97-smp. All nodes are
connected with a 1 Gb/s switch.



[Figure 7 diagram: ganode and glnode nodes connected
through a 1 Gb/s switch.]

Figure 7: The testbed.



6.4. The real experiments of World Cup workload

On each node of glnode, we deploy eight XEN [38]
virtual machines. For each XEN virtual machine, one
core and 256 MB of memory are allocated, and the guest
operating system is 64-bit CentOS with kernel version
2.6.18-XEN.

On the testbed, we deploy a WS TRE shown in Fig. 6.
In the experiments, Load Balancer is LVS [27] with direct
route mode [18]. Each Agent and Monitor are deployed on
each virtual machine. LVS and other services are deployed on
ganode004, since all of them have light load. We choose
the least-connection scheduling policy [18] to distribute
requests. We choose httperf [17] as the load generator and
the open source application ZAP! [19] as the target Web
service. The versions of httperf, LVS and ZAP! are
respectively 0.9.0, 1.24 and 1.4.5. Two httperf instances
are deployed on ganode002 and ganode003.
The Web workload trace is obtained from the World
Cup workload trace [3] with a scaling factor of 2.22.
The experiments include two steps. First, we decide
the instance adjustment policy; second, we obtain the
resource consumption trace.

In the first step, we deploy PhoenixCloud with the
instance adjustment policy disabled. With this configuration,
WS Manager will not adjust the number of Web service
instances. On the testbed of 16 virtual machines, 16
instances of ZAP! are deployed, one instance per
virtual machine. When httperf generates different
scales of load, we record the actual throughput, the average
response time and the average utilization rate of CPU
cores. Since one CPU core is allocated to one virtual
machine, the number of VCPUs equals the
number of CPU cores. So the average utilization rate
of each CPU core is also the average utilization rate of
VCPUs. Fig. 8 shows the relationship between the actual
throughput and the average utilization rate of VCPUs. From
Fig. 8 we observe that when the average utilization rate
of VCPUs is below 80%, the average response time of
requests is less than 50 milliseconds. However, when the
average utilization rate of VCPUs increases to 97%, the
average response time of requests dramatically increases to
1528 milliseconds.

Figure 8: Relationship between actual throughput and
average utilization rate of VCPUs on the testbed of 16 virtual
machines.

Figure 9: Throughput and average response time vs.
number of virtual machines.



Based on the above observation, we choose the average 
utilization rate of VCPUs as the criterion of adjusting the 
number of instances of ZAP!, and set 80% as the threshold 
value. 

For ZAP!, we specify the instance adjustment policy as
follows: the initial number of Web service instances is
two. If the average utilization rate of VCPUs consumed
by all instances of Web service exceeds 80% in the past
20 seconds, WS Manager will add one instance. If the
average utilization rate of VCPUs consumed by the current
instances of Web service is lower than 80% x (n - 1)/n in the
past 20 seconds, where n is the number of current instances,
WS Manager will remove one instance.
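
The instance adjustment rule can be sketched in Python as below. This is a hedged illustration: the function name is hypothetical, the scale-in threshold is taken here to be 80% x (n - 1)/n, and never dropping below one instance is an assumption of this sketch that the text does not state.

```python
def adjust_instances(n, avg_utilization, threshold=0.80, min_instances=1):
    """Return the new instance count given the 20-second average VCPU utilization.

    n:               current number of Web service instances
    avg_utilization: average VCPU utilization in [0, 1]
    min_instances:   floor on the instance count (assumption, not from the text)
    """
    if avg_utilization > threshold:
        return n + 1
    # Scale in only if the remaining n - 1 instances would stay under the threshold.
    if n > min_instances and avg_utilization < threshold * (n - 1) / n:
        return n - 1
    return n
```

With four instances, the scale-in threshold is 60%: a utilization of 55% removes one instance, while 70% leaves the count unchanged and 85% adds one.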
In the second step, we deploy PhoenixCloud with the
above instance adjustment policy enabled. WS Manager
adjusts the number of Web service instances according to
the instance adjustment policy. In the experiments, we
also record the relationship between the actual
throughput, the average response time and the number of
virtual machines.

From Fig. 9 we observe that for different numbers of VMs,
the average response time is below 700 milliseconds and
the throughput increases linearly when the number of VMs
increases. This indicates that the instance adjustment policy
is appropriate, though it may not be optimal.

With the above policies, we obtain the resource
consumption trace over two weeks. Fig. 10 shows the World Cup
resource consumption trace, of which the peak resource
demand is 64 VMs.

Figure 10: The World Cup resource trace in two weeks.



In the following simulation experiments, if PRC_WS is
the same in different (PRC_PBJ, PRC_WS) tuples, we use
the same World Cup resource trace as the input of Web
services in DCS, PhoenixCloud and EC2+RightScale.

6.5. Simulation Experiments of dedicated cluster system 
and PhoenixCloud 
In this section, we compare DCS and PhoenixCloud in
the private Cloud scenario, in which a resource provider owns
fixed resources that satisfy the resource requests of two
runtime environments for heterogeneous workloads.

6.5.1. The simulated systems 
• The simulated dedicated cluster system 

Since the configuration of DCS is decided by the peak
resource demand of a workload, for a workload tuple
(PRC_PBJ, PRC_WS) we presume that the configuration
size of the simulated cluster system is the sum
of PRC_PBJ and PRC_WS, which is also the smallest
valid configuration size. Fig. 11 shows the simulated
dedicated cluster system. Resources are statically
allocated to two runtime environments: PRC_PBJ nodes
for a PBJ RE and PRC_WS nodes for a WS RE. The job
simulator is used to simulate the process of submitting
jobs.

Figure 11: The simulated DCS.



• The simulated PhoenixCloud system
For a workload tuple (PRC_PBJ, PRC_WS), in
PhoenixCloud, we presume that the upper bound of the
configuration size of the simulated cluster system
is the sum of PRC_PBJ and PRC_WS. However,
the configuration size of the simulated cluster may
decrease to a lower value.

Figure 12: The simulated PhoenixCloud system.

In comparison with the real PhoenixCloud system in
Fig. 6, our simulated PhoenixCloud in Fig. 12 keeps
Resource Provision Service, PBJ Manager, WS Manager
and Scheduler, while other services are removed. For
a WS TRE, the resource simulator simulates the varying
resource consumption and drives WS Manager to
request or release resources from or to Resource
Provision Service.

6.5.2. Experiment configurations 

• The resource coordination and management policy. 
For DCS, resources are statically allocated to a run- 
time environment. PhoenixCloud adopts the FB pol- 
icy. 

• The scheduling policy. A dedicated cluster system
and PhoenixCloud adopt the same first-fit scheduling
policy for parallel batch jobs. The first-fit scheduling
policy scans all the queued jobs in the order of
job arrival and chooses the first job whose resource
requirement can be met by the system to execute.
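
The first-fit rule is simple enough to state in a few lines of Python. The sketch assumes the queue is a list of node demands in arrival order; the function name `first_fit` is hypothetical.

```python
def first_fit(queue, free_nodes):
    """Return the index of the first queued job that fits, or None.

    queue:      node demand of each queued job, in order of arrival
    free_nodes: nodes currently available in the system
    """
    for index, demand in enumerate(queue):
        if demand <= free_nodes:
            return index
    return None
```

With 20 free nodes and queued demands of 64, 16 and 8 nodes, the 16-node job is chosen: it is not at the head of the queue, but it is the first job whose requirement can be met.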

6.5.3. Simulation Experiment Results 

Table 1 and Table 2 respectively summarize experiment
results for NASA iPSC+World Cup, of which the tuple of
peak resource demands (PRC_PBJ, PRC_WS) is (128, 128),
and SDSC BLUE+World Cup, of which the tuple of peak
resource demands (PRC_PBJ, PRC_WS) is (144, 128). In
the rest of this section, we will investigate the effect of
varied tuples of peak resource demands.

Table 1: Metrics of DCS and PhoenixCloud for NASA
iPSC+World Cup.

Table 2: Metrics of DCS and PhoenixCloud for SDSC
BLUE+World Cup.

From Table 1 and Table 2 we can observe two facts.
First, using the FB policy in PhoenixCloud, when the
configuration size of the simulated cluster is no more than
85% of that of DCS, the throughput in terms of the
number of completed jobs of PhoenixCloud is higher than
that of DCS (BLUE+WorldCup) or the same as that of
DCS (iPSC+WorldCup); at the same time, the average
turnaround time of PhoenixCloud is better than that of
DCS (BLUE+WorldCup) or close to that of DCS
(iPSC+WorldCup).

Second, when the throughput is almost the same as that of
DCS, with a small amount of delay in terms of the average
turnaround time (maximally by 38%), the configuration
size of the simulated cluster system can be decreased by
about 40% for two groups of heterogeneous workloads.
This is because: (a) for both Web service and parallel
batch jobs, ratios of peak loads to normal load are high,
but the peak loads of the two traces occur at different times;
(b) when Web service has a short spike, the FB policy will
kill running jobs with the smallest resource demands, so we
can decrease the configuration size of the cluster system, but
at the same time increase the average turnaround time.
When PRC_PBJ is the same, Table 3 and Table 4 show
the effect of different ratios of PRC_WS to PRC_PBJ on
the performance metrics of PhoenixCloud.

Table 3: Metrics of PhoenixCloud for iPSC+WorldCup.

Table 4: Metrics of PhoenixCloud for BLUE+WorldCup.
From Table 3 and Table 4 we can observe that when the two
peak resource demands in (PRC_PBJ, PRC_WS) are close,
the percent of saved resources, which is obtained with the
smallest configuration size of cluster, outperforms other
cases. This is because when we consolidate two
heterogeneous workloads, the configuration size of PhoenixCloud
must be at least the maximum of the two peak
resource demands. For parallel batch jobs, if the
configuration size of the cluster is less than the resource
demand of the biggest job, the biggest job cannot run. For
Web service, if the configuration size of the cluster is less
than the peak resource demand, overload will happen.

6.6. Simulation Experiments of EC2+RightScale and
PhoenixCloud

In this section, we compare the performance of Phoenix- 
Cloud and EC2+RightScale in the public Cloud scenario. 
We presume that the simulated cluster system has abun- 
dant resources with respect to resource requests of two 
heterogeneous workloads in both systems. 

6.6.1. The simulated systems 

• The simulated EC2+RightScale system 

Because RightScale provides the same scalable
management for Web service as PhoenixCloud, we just
use the same resource consumption trace for Web
service in the two systems, which is obtained in Section
6.4. For parallel batch jobs, in EC2, end users
simultaneously request the resources needed by parallel batch
jobs, and the submitted jobs run immediately,
so there is no need for Scheduler. Fig. 13 shows the
simulated architecture of EC2+RightScale.

• The simulated PhoenixCloud

The simulated PhoenixCloud is the same as that shown
in Fig. 12 but with the FLB-NUB policy.




Figure 13: The simulated system of EC2+RightScale.



6.6.2. Experiment configurations 
• The resource coordination policy. For PhoenixCloud,
we adopt the FLB-NUB policy. For EC2+RightScale,
there is no resource coordination between two
runtime environments.

• The scheduling policy of parallel batch jobs.
PhoenixCloud adopts the first-fit scheduling policy. EC2 needs
no scheduling policy, since each end user is
responsible for running parallel batch jobs.

• The resource management policy. For both systems,
there is a time unit of leasing resources. We
presume that the lease term of a resource is a time
unit of leasing resources times a positive integer. In
the EC2+RightScale solution, for parallel batch jobs,
each end user is responsible for manually managing
resources on the EC2 system, and we presume that a
user only releases resources at the end of a time
unit of leasing resources once a job has completed. This is
because: a) EC2 charges the usage of resources in
terms of a time unit of leasing resources (an hour); b)
it is difficult for end users to predict the completion
time of jobs, and hence releasing resources to the resource
provider on time is almost impossible. PhoenixCloud
adopts the FLB-NUB policy.
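
The effect of leasing in whole time units on the EC2 side can be illustrated with a short Python sketch. It assumes a job holds its nodes from start to completion and that the lease is rounded up to a whole number of time units (at least one); the function name `billed_node_hours` is hypothetical.

```python
import math


def billed_node_hours(nodes, runtime_minutes, lease_unit_minutes=60):
    """Node x hour consumed when resources are released only at the end
    of the lease unit in which the job completes."""
    # The lease term is a positive integer number of lease units.
    units = max(1, math.ceil(runtime_minutes / lease_unit_minutes))
    return nodes * units * lease_unit_minutes / 60
```

A 4-node job that runs for 61 minutes therefore consumes 8 node x hour rather than about 4, which is one reason the total resource consumption of this baseline can exceed the actual demand of the jobs.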

6.6.3. Experiment Results 

Before reporting experiment results, we pick the
following parameters as the baseline configuration of
PhoenixCloud for comparison; detailed parameter
analysis is deferred to Section 6.6.4.
Through comparisons across a large number of
experiments, we set the baseline parameters in
PhoenixCloud: [B25/U1.2/V0.2/G0.5] for iPSC+WorldCup
and [B27/U1.2/V0.2/G0.5] for SDSC+WorldCup,
where [B25/U1.2/V0.2/G0.5] indicates that the size of
coordinated resources (which is represented as B) is 25
nodes and the threshold ratio of requesting resources (which
is represented as U) is 1.2; V0.2 indicates that the threshold
ratio of releasing resources (which is represented as V) is
0.2; G0.5 indicates that the elastic factor of releasing
resources (which is represented as G) is 0.5. In both
systems, the time unit of leasing resources (which is
represented as L) is 60 minutes.
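
For readers who script parameter sweeps, the bracketed notation can be parsed mechanically. The Python sketch below is only a convenience illustration: the `Baseline` class and `parse_baseline` function are hypothetical, and the parser covers only the B, U, V and G fields described above.

```python
import re
from dataclasses import dataclass


@dataclass
class Baseline:
    b: int      # size of coordinated resources, in nodes
    u: float    # threshold ratio of requesting resources
    v: float    # threshold ratio of releasing resources
    g: float    # elastic factor of releasing resources


def parse_baseline(text):
    """Parse a string such as '[B25/U1.2/V0.2/G0.5]' into a Baseline."""
    fields = dict(re.findall(r"([BUVG])(\d+(?:\.\d+)?)", text))
    missing = set("BUVG") - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return Baseline(
        b=int(fields["B"]),
        u=float(fields["U"]),
        v=float(fields["V"]),
        g=float(fields["G"]),
    )
```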

Table 5 and Table 6 respectively summarize the
experiment results for iPSC+WorldCup, of which
(PRC_PBJ, PRC_WS) is (128, 128), and BLUE+WorldCup
traces, of which (PRC_PBJ, PRC_WS) is (144, 128).
From the two tables, we can observe two facts:

(1) The total resource consumption of PhoenixCloud is
less than that of EC2+RightScale (maximally by 28%
and minimally by 14%) with a small amount of delay in
terms of average turnaround time per job (maximally by
44% and minimally by 35%);

(2) PhoenixCloud decreases the peak resource
consumption maximally to 31% with respect to that of
EC2+RightScale. This is because PhoenixCloud only
requests resources on the condition that the threshold
ratio of requesting resources is exceeded; otherwise jobs
are queued. PhoenixCloud therefore decreases peak resource
consumption and total resource consumption, and increases
the average turnaround time.

When PRCpBj is the same. Table |7] and Table [5] show 
the effect of different ratios of PRCws to PRCpbj on 
the performance metrics of PhoenixCloud. Due to the 
space limitation, we constrain most of our discussion 
to the configuration of BR0.1_U1.2_V0.2_C0.5_L60, 
where BRO.l indicates the ratio of the size of the coor- 
dinated resources of PhoenixCloud to the sum of PRCws 
and PRCpBj is 0.1. 

From Table 7 and Table 8, we can observe that when the
ratio of PRC_WS to PRC_PBJ increases, the percent of
saved resources (%) increases; this percentage is obtained
against the sum of PRC_WS and PRC_PBJ times the trace
duration. This observation is different from that of the FB
policy in Section 6.5.3. This is because in the FLB-NUB
policy, resources can be dynamically requested beyond the
lower resource bound, while in the FB policy, resources
can only be dynamically requested within the limit of the
lower resource bound.

Table 5: Metrics of EC2+RightScale and PhoenixCloud for
iPSC+WorldCup.

Table 6: Metrics of EC2+RightScale and PhoenixCloud for
SDSC+WorldCup.

Table 7: Metrics of PhoenixCloud for iPSC+WorldCup.

(PRC_PBJ,   number of   average execution   average turnaround   saved
PRC_WS)     completed   time (seconds)      time (seconds)       resources
            jobs                                                 (%)
(128,64)    2603        573                 839                  38.3%
(128,128)   2603        573                 826                  46.8%
(128,256)   2603        573                 839                  58.5%

Table 8: Metrics of PhoenixCloud for BLUE+WorldCup.

Because of space limitations, we are unable to present the
data for the effect of all parameters; instead, we constrain
most of our discussion to configurations in which one
parameter varies while the other parameters keep the same
values as those of the baseline configuration in Section
6.6.3. These configurations are representative of the
trends that we observe across all cases.
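The saved-resources percentage in Table 7 is described as being obtained against the sum of PRC_WS and PRC_PBJ times the trace duration. A minimal sketch of that calculation follows, assuming "saved" means the relative reduction with respect to that baseline. The function name and the input values are hypothetical and are not taken from the experiments.

```python
def saved_resources_percent(total_consumption: float, prc_pbj: int,
                            prc_ws: int, trace_duration: float) -> float:
    """Percent of resources saved relative to (PRC_PBJ + PRC_WS) * duration.

    total_consumption and the baseline must use the same unit,
    for example node*hours.
    """
    baseline = (prc_pbj + prc_ws) * trace_duration
    return (1.0 - total_consumption / baseline) * 100.0


# Hypothetical numbers: a 100-hour trace on (128, 128) gives a baseline of
# 25600 node*hours; consuming 12800 node*hours saves half of it.
print(saved_resources_percent(12800, 128, 128, 100))
```

With the total consumption held roughly constant, a larger PRC_WS enlarges the baseline, which is one way to read the rising percentages in Table 7.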

The effect of the size of coordinated resources (B).
To save space, in PhoenixCloud we tune B, while the
other parameters are [U1.2/V0.2/G0.5/L60]. Fig. 14 and
Fig. 15 show the effect of different B values for two
groups of heterogeneous workloads. In the rest of this
section, the tuples (PRC_PBJ, PRC_WS) of iPSC+WorldCup
and BLUE+WorldCup are respectively (128,128) and
(144,128).

From Fig. 14 and Fig. 15, we have the following
observations:

1) With the increase of B, the total resource consumption
increases, while the average turnaround time decreases.
This is because resources under the lower resource bound
are only allocated to PBJ TRE and WS TRE, and hence
idle resources also increase when B increases for the
same workload; at the same time, with the increase of B,
more resources are provisioned to PBJ TRE, so the
average turnaround time per job decreases.

2) The change of B has a small effect on the number of
completed jobs. This is because PhoenixCloud can
dynamically request resources when the threshold ratio of
requesting resources is exceeded.

The effects of the threshold ratios of requesting
resources and releasing resources (U and V) and the
elastic factor of releasing resources (G). To save
space, in PhoenixCloud we tune one of U, V, G, while the
other parameters are [B25/U1.2/V0.2/G0.5/L60] for
iPSC+WorldCup and [B27/U1.2/V0.2/G0.5/L60] for
BLUE+WorldCup. Fig. 16 and Fig. 17 show the effect of
different parameters.

From Fig. 16 and Fig. 17, we have the following
observations: 1) U, V, G have a small effect on the total
resource consumption and the number of completed jobs
when B is fixed. 2) The average turnaround time increases
with G when B is fixed. This is because a larger elastic
factor of releasing resources results in fewer idle
resources when new jobs are submitted. U and V have a
small effect on the average turnaround time.

Figure 14: Peak and total resource consumption vs.
different B. Panels: (a) iPSC+WorldCup; (b) BLUE+WorldCup.

Figure 15: The number of completed jobs and average
turnaround time vs. different B. Panel (b): BLUE+WorldCup.

The effect of the time unit of leasing resources.
We respectively set the time unit of leasing resources
L to 15/30/60/120/240 minutes, while the other parameters
are [B25/U1.2/V0.2/G0.5] for the NASA iPSC workload and
[B27/U1.2/V0.2/G0.5] for the SDSC BLUE workload. In
Fig. 18, iPSC-15 means that L is 15 minutes and the
workload is iPSC.

From Fig. 18, we have the following observation: the
management overhead is inversely proportional to L.
This is because when the time unit of leasing resources
is shorter, the service provider requests resources more
frequently.

Taking into account that resources are charged at the
granularity of a time unit of leasing resources, we make a
tradeoff and select L as 60 minutes in PhoenixCloud and
EC2+RightScale. In fact, in the EC2 system, resources are
also charged at the granularity of one hour.
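The tradeoff behind the choice of L can be made concrete with a small sketch. It rests on two simplifying assumptions that this section does not spell out: the management overhead is modeled as the number of leasing decision points over a trace (one per time unit), and usage is billed in whole leasing time units. The function names are hypothetical.

```python
import math


def lease_decisions(trace_minutes: int, lease_minutes: int) -> int:
    """Number of leasing decision points over a trace, one per time unit L.

    Assumption: a stand-in for management overhead, not the paper's metric.
    """
    return math.ceil(trace_minutes / lease_minutes)


def billed_minutes(used_minutes: int, lease_minutes: int) -> int:
    """Usage rounded up to whole leasing time units, as charged per node."""
    return math.ceil(used_minutes / lease_minutes) * lease_minutes


# One day of trace and a node that is needed for 70 minutes.
for lease in (15, 30, 60, 120, 240):
    print(lease, lease_decisions(24 * 60, lease), billed_minutes(70, lease))
```

A shorter L wastes fewer billed minutes but multiplies the number of requests, while a longer L does the opposite; 60 minutes sits between the two extremes in this toy model.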
Implications of Analysis. Based on the above analysis,
we have the following suggestions for choosing the
parameters of two coordinated runtime environments for
Web services and parallel batch jobs. Since increasing B
also increases the total resource consumption, we suggest
selecting a low value for B: about 10% of the sum of
PRC_PBJ and PRC_WS. Increasing the elastic factor of
releasing resources G delays the average turnaround
time; our experiments show that 0.5 makes a good
compromise. According to our experiments, U has a small
effect on the metrics when it is greater than 1.0 and less
than 2.0, and V has a small effect on the metrics when it
is greater than 0.1 and less than 0.5. So we suggest that
service providers choose the baseline configuration in
Section 6.6.3 for U, V, G.

6.7. Discussions

Our experiments show that a service provider has three
choices in consolidating heterogeneous workloads:

1) If resorting to a private Cloud with a fixed size,
it should choose PhoenixCloud with the FB policy. With
this solution, the configuration size is the smallest
compared with the other three solutions. However, this
solution increases both the average execution time and
the average turnaround time, since jobs may be killed to
reallocate resources to Web services.

In a public Cloud scenario, there are two choices:

2) If the average turnaround time per job matters most,
it should choose the EC2+RightScale solution. However,
this solution results in a higher peak resource
consumption, which is several times (two or three in our
experiments) that of PhoenixCloud, and a larger total
resource consumption.

3) If making a tradeoff between the resource consumption
and the average turnaround time of jobs, it should
choose PhoenixCloud with the FLB-NUB policy. With
this solution, the total and peak resource consumption of
PhoenixCloud are smaller than those of EC2+RightScale,
while the average turnaround time is larger than that of
EC2+RightScale by a small delay.

Figure 16: Peak and total resource consumption vs.
different G, V, U. Panel (b): BLUE+WorldCup.

Figure 17: The number of completed jobs and average
turnaround time vs. different G, V, U.

7. CONCLUSIONS 

In this paper, we presented a runtime environment spec- 
ification that expresses diverse runtime environment re- 
quirements and built an innovative system PhoenixCloud 
to enable creating coordinated runtime environments on 
demand for heterogeneous workloads in different Cloud 
scenarios. For two typical heterogeneous workloads: Web 
services and parallel batch jobs, we respectively proposed 
a coordinated resource provisioning solution in two differ- 
ent Cloud scenarios. 

For three typical workload traces, SDSC BLUE, NASA
iPSC and World Cup, our experiments showed that: a)
in the private Cloud scenario, when the throughput is
almost the same as that of DCS, our solution decreases the
configuration size of the cluster by about 40%; b) in the
public Cloud scenario, our solution decreases not only the
total resource consumption, but also the peak resource
consumption maximally to 31% with respect to that of the
EC2+RightScale solution.

Figure 18: Management overhead vs. different time unit